Information  Filtering  for  Mobile  Augmented  Reality 


Simon  Julier 
Marco  Lanzagorta 
Yohan  Baillot 
Lawrence  Rosenblum* 

Advanced  Information  Technology 
Naval  Research  Laboratory 
Washington  DC 

{julier, lnzgrt, baillot, rosenblu}@ait.nrl.navy.mil


Steven  Feiner 
Tobias  Hollerer 


Dept. of Computer Science
Columbia University
New York, NY 10027
{htobias, feiner}@cs.columbia.edu


Sabrina  Sestito 

Defence  Science  and  Technology  Organisation 
506  Lorimer  St., 

Port  Melbourne  3207 
AUSTRALIA 

sabrina.sestito@dsto.defence.gov.au 


Abstract 

Augmented reality is a potentially powerful paradigm for
annotating the environment with computer-generated material.
These benefits will be even greater when augmented
reality systems become mobile and wearable. However, to
minimize the problem of clutter and maximize the effectiveness
of the display, algorithms must be developed to select
only the most important information for the user. In this
paper we describe a region-based information filtering
algorithm. The algorithm takes account of the state of the
user (location and intent) and the state of individual objects
about which information can be presented. It can dynamically
respond to changes in the environment and the user’s
state. We also describe how simple temporal, distance and
angle cues can be used to refine the transitions between
different information sets.


1  Introduction 

Augmented reality (AR) integrates virtual information
with the user’s physical environment. Graphics-based AR
can provide a user with a “heads up display” in which
computer graphics is spatially registered with, and overlaid on,
geographic locations and real objects. Experimental AR
systems have been demonstrated in a range of potential
applications, from aircraft manufacture [7] to image-guided
surgery [11], and from maintenance and repair [9, 13] to
building construction [23]. Improvements in portable
computing hardware, position and orientation trackers, and
see-through displays promise to make wearable AR a
commercial reality this decade.

If  a  graphics-based  AR  system  is  to  be  effective,  care 
must  be  taken  to  ensure  that  its  display  is  not  cluttered  with 
too  much  information.  This  problem  is  illustrated  in  Fig¬ 
ure  3(a),  which  is  imaged  through  the  see-through  head- 
worn  display  of  an  experimental  mobile  AR  system  cur¬ 
rently  under  development  at  the  Naval  Research  Labora¬ 
tory  [3].  The  system  presents  all  the  data  that  it  has  about 
the  environment.  The  resulting  display  is  highly  cluttered 
and  many  of  the  labels  and  wire  frame  diagrams  obscure 
both  one  another  and  the  environment.  Such  clutter  can  un¬ 
dermine  the  effectiveness  of  an  AR  display  [21]. 

One way to address this problem is through information
filtering. Information filtering means culling the information
that can potentially be displayed by identifying and
prioritizing the information that is relevant to a user at a given
point in time. In the case of AR or other situated user
interfaces [15, 8] that take into account the user’s location,
information can be classified based on the user’s physical
context, as well as on their current tastes and objectives.
Information filtering is a key component of environment
management [17], a term that we have previously used to refer
to the task of managing the large number of displays,
interaction devices, and virtual objects with which a user will
interact in a world populated with ubiquitous mobile and
wearable computers.

In this paper, we consider the problem of information
filtering for mobile augmented reality and describe an
information filtering algorithm that we have developed. We
begin by describing our application scenario in Section 2
and previous work in Section 3. Next, we discuss our
information filtering mechanism in Section 4, and our current
implementation of the algorithm in Section 5. Finally,
Section 6 presents our conclusions and future work that we
intend to explore.

2  Application  Scenario 

Our  goal  is  to  develop  software  systems  and  interaction 
techniques  to  support  multiple,  mobile,  collaborating  users 
with  wearable  AR  systems.  These  users  would  interact  with 
other  users  of  stationary  VR,  AR,  and  desktop  systems.  We 
consider  the  following  basic  application  scenario  [14]: 

•  Multiple  users  with  mobile  AR  systems  are  free  to 
roam  through  an  urban  environment.  Each  user  per¬ 
forms  one  or  more  tasks  (such  as  “follow  a  route  be¬ 
tween  two  specified  points”),  which  can  be  acted  upon 
sequentially  or  concurrently. 

•  The  AR  systems  help  users  accomplish  their  tasks  by 
providing  them  with  relevant  information  about  their 
environment.  For  example,  this  might  include  names 
and  other  properties  of  buildings  and  infrastructure  that 
may  or  may  not  be  directly  visible  from  a  user’s  current 
location. 

•  Users  can  interact  with  the  information  presented  to 
them;  for  example,  by  creating  annotations  that  can  be 
attached  to  locations  or  objects. 

•  Collaboration  among  mobile  users  can  be  achieved 
through  sharing  information;  for  example,  by  mobile 
users  exchanging  annotations. 

•  A  supervisory  “base  station”  (e.g.,  a  stationary  multi¬ 
user  virtual  environment  system)  oversees  the  actions 
of  the  mobile  users.  Base  station  users  receive  in¬ 
formation  from  mobile  users  and  can  send  them  ad¬ 
ditional  information  about  the  environment  and  their 
tasks. 

This  is  an  extremely  general  scenario  that  is  relevant  to  a 
wide  range  of  applications,  including  field  maintenance  or 
sales,  law  enforcement,  the  military,  utility  and  emergency 
services,  and  even  tourism. 


Our  groups  have  developed  two  mobile  AR  systems, 
one of which is shown in Figure 1. Each system is
composed of 6DOF trackers (an Ashtech GG Surveyor
real-time-kinematic GPS for position, an InterSense IS300Pro
for orientation), a see-through head-worn display (Sony
LDI-D100B Glasstron), a wireless network (a FreeWave
radio modem), and a wearable computer with 3D hardware
graphics acceleration (using either ABX or PC 104 form
factor [3]). Our current test datasets, of approximately 30
buildings,  include  about  150  objects.  The  types  of  ob¬ 
jects  include  buildings,  windows,  doors  and  tunnels.  Future 
models  are  likely  to  contain  at  least  an  order  of  magnitude 
more  objects. 

Given  the  number  and  the  density  of  the  objects,  infor¬ 
mation  filtering  is  vital  to  prevent  cluttering  the  display.  An 
informal  domain  analysis  of  our  application  scenario  sug¬ 
gested  to  us  that  the  filtering  mechanism  should  take  into 
account  the  following  properties: 

•  Users  will  perform  a  broad  range  of  tasks,  from  main¬ 
taining  general  situational  awareness  of  their  environ¬ 
ment,  to  searching  for  specific  objects,  to  attending  to 
a  specific  set  of  objects  involved  in  an  activity. 

•  Any  object,  of  any  type,  at  any  point  in  time,  can  be¬ 
come  sufficiently  important  that  it  must  be  able  to  pass 
the  filtering  criteria. 

•  Certain  objects  are  important  to  all  users  at  all  times. 

•  Certain  objects  are  important  to  all  users  whenever 
they  are  performing  a  particular  task. 

•  Some  objects  (such  as  the  way  points  that  define  a 
route)  are  only  important  to  the  activities  of  a  partic¬ 
ular  user. 

•  All  things  being  equal,  the  amount  of  information 
shown  to  a  user  about  an  object  is  inversely  propor¬ 
tional  to  the  distance  of  that  object  from  the  user.  For 
example,  at  a  sufficient  distance,  the  only  information 
that  might  be  shown  about  a  building  would  be  a  text 
label.  As  the  user  approaches  the  building,  more  infor¬ 
mation  might  appear  (e.g.,  locations  of  main  entrances 
and  exits).  Finally,  at  very  close  distance,  the  user 
might  be  shown  information  about  the  physical  con¬ 
tents  of  the  building  (e.g.,  a  floor  plan). 
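
The distance-dependent behaviour in the last bullet can be sketched in a few lines of Python. This is only an illustration: the thresholds `NEAR_M` and `MID_M` and the function name `detail_for_distance` are hypothetical, since the text gives no numeric distances.

```python
# Hypothetical distance thresholds in metres; the paper gives no numbers.
NEAR_M = 25.0
MID_M = 150.0


def detail_for_distance(distance_m: float) -> list:
    """Return the kinds of information shown for a building at a given distance."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    shown = ["label"]                # far away: only a text label
    if distance_m <= MID_M:
        shown.append("entrances")    # closer: main entrances and exits
    if distance_m <= NEAR_M:
        shown.append("floor_plan")   # very close: contents of the building
    return shown
```

The amount of information grows monotonically as the user approaches, which matches the label, entrances, floor plan progression described above.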

3  Previous  Work 

Filtering crowded information displays to prevent clutter
and improve human performance or rendering performance
has long been recognized as a potential problem for
information display systems [19]. Control of the level of detail
at which an object is viewed (in the limiting case,
determining whether it is even visible), based strictly on its distance
from the viewpoint or the size of its projection, is supported
in many current 3D modeling languages, and was originally
developed for early 3D vector graphics systems [22]. For
example, in VRML 97 [6] and Java3D [20], different
versions of an object’s geometry can each be associated with a
specific distance range between the viewpoint and object.

Figure 1. Prototype mobile AR system.

Fish-eye  views  [12]  provide  a  general  approach  to  fil¬ 
tering  object  display  based  on  a  combination  of  (spatial  or 
conceptual)  distance  of  an  object  from  one  or  more  focal 
points  and  some  measure  of  an  object’s  a  priori  importance. 

The  spatial  model  of  interaction  [4]  treats  awareness 
and  interaction  in  multi-user  virtual  environments,  where 
awareness  can  be  used  to  determine  whether  or  not  an  ob¬ 
ject  is  visible  to,  or  capable  of  interaction  with,  another  ob¬ 
ject.  In  this  model,  each  object  (e.g.,  a  user),  is  surrounded 
by  a  focus,  specific  to  a  medium  (e.g.,  graphics  or  sound), 
which  defines  the  part  of  the  environment  of  which  the  ob¬ 
ject  is  aware  in  that  medium.  Each  object  in  the  environ¬ 
ment  also  has  a  medium-specific  nimbus,  which  demarcates 
the  space  within  which  other  objects  can  be  aware  of  that 
object.  In  the  general  spatial  model  of  interaction,  each  ob¬ 
ject  has  a  different  focus  and  nimbus  for  each  medium  sup¬ 
ported  by  the  system,  nimbi  and  foci  can  be  of  arbitrary  size 
and  shape  (e.g.,  asymmetric  or  disjoint)  and  may  be  discrete 
or continuous, and the awareness that object A has of
object B in a particular medium is some function of A’s focus
and B’s nimbus in that medium. Specific examples of the
model have been implemented in the MASSIVE and DIVE
systems [5], which take different approaches to computing
awareness. For example, in DIVE, awareness is a binary
function,  where  A  is  aware  of  B  if  A’s  focus  overlaps  with 
B’s  nimbus.  In  contrast,  in  MASSIVE,  foci  and  nimbi  are 
scalar  fields  radiating  from  point-sized  objects,  focus  and 
nimbus  values  are  sampled  at  each  object’s  position,  and 
A’s  level  of  awareness  of  B  is  the  product  of  B’s  value  in 
A’s  focus  and  A’s  value  in  B’s  nimbus. 
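
The two awareness computations can be contrasted with a small Python sketch. It is not taken from either system: the disc-shaped regions, the linear fall-off in `linear_field`, and all function names are assumptions made for illustration only.

```python
import math


def dist(a, b):
    """Euclidean distance between two 2D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def dive_awareness(a_pos, a_focus_radius, b_pos, b_nimbus_radius):
    """Binary awareness: True if A's focus disc overlaps B's nimbus disc."""
    return dist(a_pos, b_pos) <= a_focus_radius + b_nimbus_radius


def linear_field(center, radius, sample):
    """Assumed scalar field: 1 at the center, falling linearly to 0 at radius."""
    return max(0.0, 1.0 - dist(center, sample) / radius)


def massive_awareness(a_pos, a_focus_radius, b_pos, b_nimbus_radius):
    """Graded awareness: product of the two fields sampled at the objects."""
    focus_at_b = linear_field(a_pos, a_focus_radius, b_pos)    # B's value in A's focus
    nimbus_at_a = linear_field(b_pos, b_nimbus_radius, a_pos)  # A's value in B's nimbus
    return focus_at_b * nimbus_at_a
```

The binary form only answers whether the regions touch, while the product form yields a level of awareness that decays as the objects move apart.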


Several  researchers  have  addressed  the  problem  of  filter¬ 
ing  overlaid  information  for  AR.  KARMA  [9]  uses  a  rule- 
based  approach  to  select  relevant  information  to  assist  a  user 
performing  a  maintenance  and  repair  task.  The  user’s  po¬ 
sition  and  orientation,  inter-object  occlusion  relationships, 
and  the  role  that  the  objects  play  in  a  specific  task  to  be 
accomplished  by  the  user,  all  determine  whether  and  how 
objects  should  be  displayed,  highlighted,  and  labeled  on  a 
tracked,  see-through,  head-worn  display.  Although  imple¬ 
mented  in  a  stand-alone  VRML  browser,  rather  than  in  an 
AR  system,  InfoLOD  [18]  determines  whether  and  how  to 
label  buildings  in  a  virtual  cityscape.  In  InfoLOD,  informa¬ 
tion  is  associated  with  specific  sides  of  cuboidal  buildings; 
visibility  decisions  are  based  on  a  building’s  distance  from 
the  viewer  and  on  the  building’s  orientation  relative  to  the 
viewer, making it possible to treat information associated
with  different  sides  differently. 


Some  of  the  approaches  that  we  have  surveyed  here 
are just high-level techniques: the distance-based
level-of-detail support of VRML 97 and Java3D, fisheye views, and
the  abstract  spatial  model  of  interaction.  Therefore,  a  num¬ 
ber  of  implementation  decisions  would  need  to  be  made  to 
instantiate  any  of  these.  The  remaining  approaches  cor¬ 
respond  to  actual  implementations:  KARMA’s  rule-based 
technique,  the  specific  versions  of  the  spatial  model  of  in¬ 
teraction  used  in  MASSIVE  and  DIVE,  and  InfoLOD.  Of 
these,  KARMA’s  rule  set  is  designed  to  accommodate  phys¬ 
ical  tasks  that  require  the  display  of  a  relatively  small  num¬ 
ber  of  objects  that  are  directly  related  to  the  task  to  be  per¬ 
formed,  and  does  not  address  the  wider  range  of  tasks  re¬ 
quired  of  our  users.  InfoLOD  does  not  take  into  account 
the  user’s  current  task  or  object  importance  metrics  at  all, 
while MASSIVE and DIVE can, by varying the size
and  shape  of  foci  and  nimbi.  However,  none  of  these  im¬ 
plementations  of  the  spatial  model  of  interaction  deal  with 
the  specifics  of  how  object  importance  metrics  and  task  in¬ 
formation  can  be  incorporated,  or  are  designed  for  the  AR 
information  overlay  tasks  in  which  we  are  interested. 

4 A Region-Based Information Filter

4.1  The  Use  of  Subjective  and  Objective  Proper¬ 
ties 

The  information  filtering  system  should,  at  any  given 
time,  only  show  information  which  is  important  to  the  user. 
However,  the  importance  of  a  piece  of  information  depends 
on  the  user’s  current  context.  More  specifically,  we  assume 
that  each  user  is  assigned  a  series  of  tasks.  Each  task  re¬ 
quires  that  the  user  interacts  with  a  set  of  objects  in  certain 
ways.  To  model  these  effects,  we  assume  that  each  user  can 
interact with objects through a set of media and that each
user  and  each  object  possesses  both  objective  and  subjective 
properties. 

The  concept  of  a  medium,  defined  in  the  spatial 
model  [4],  describes  the  way  in  which  a  user  interacts  with 
an object. The original implementations in DIVE and
MASSIVE assumed that only three types of media were
available: audio, text and graphics. In a multi-user AR system,
where a user interacts with the real world, a much greater
range of media types exists. Because each medium
has different physical properties, it has an impact on the
importance of an object. One example of a medium is wireless
communications. For two users to exchange data, they must
be  within  the  transmission  range  of  their  systems.  Another 
example  is  that  some  tasks  require  a  user  to  physically  ma¬ 
nipulate  the  object.  Because  the  user  has  to  be  able  to  touch 
the  object,  the  range  over  which  the  interaction  can  occur  is 
highly  limited. 

Objective  properties  are  the  same  for  all  users,  irrespec¬ 
tive  of  the  tasks  which  that  user  is  carrying  out.  Such 
properties  include  the  object’s  classification  (for  example 
whether  it  is  a  building  or  an  underground  pipe),  its  location, 
its  size  and  its  shape.  This  can  be  extended  by  noting  that 
many  types  of  objects  have  an  impact  zone  —  an  extended 
region  over  which  an  object  has  a  direct  physical  impact. 
A  wireless  networking  system  such  as  the  WaveLAN,  for 
example,  has  a  finite  transmission  range.  This  region  can 
be  represented  as  a  sphere  whose  radius  equals  the  max¬ 
imum  reliable  transmission  range.  Conversely,  a  more  ac¬ 
curate  representation  could  take  account  of  the  masking  and 
multipath effects of buildings and terrain through modeling
the  impact  zone  as  a  series  of  interconnected  volumes.  Be¬ 
cause  of  their  differing  physical  properties,  different  media 
can  have  different  impact  zones. 

Subjective  properties  attempt  to  encapsulate  the  domain- 
specific  knowledge  of  how  a  particular  object  relates  to  a 
particular  task  for  a  particular  user.  Therefore,  they  vary 
between  users  and  depend  on  the  user’s  task  and  context. 
We propose to represent this data using an importance
vector. The importance vector stores the relevance of an object
with respect to a set of domain-specific and
user-scenario-specific criteria. For example, in a firefighting scenario,
such criteria might include whether an object is flammable
or whether a street is wide enough to allow emergency
vehicles to gain access. In general, the relevance is not
binary-valued, but is a continuum that is normalized to the range
from 0 (irrelevant) to 1 (highly relevant). For example, for
the flammability criterion, the relevance might indicate the
object’s combustibility.

Because the list of criteria and the measure of each
object against those criteria are highly domain dependent, we
assume that the choice of the criteria and the scoring of each
object with respect to those criteria are carried out by one or
more domain experts. In general, defining the list of criteria
and evaluating objects with respect to those criteria is likely
to be extremely difficult. However, such expertise is
available in some application domains. For example, the sniper
avoidance system described in the next section relies on US
Army Training manuals that precisely codify building
features and configurations [2].

The  objective-subjective  property  framework  can  be  ap¬ 
plied  to  model  the  state  of  each  user.  Each  user  has  their 
own  objective  properties  (such  as  position  and  orientation) 
and  subjective  properties  (which  refer  directly  to  the  user’s 
current  tasks).  Analogous  to  the  importance  vector  we  de¬ 
fine  the  task  vector  which  stores  the  relevance  of  a  task  to 
the  user’s  current  activities.  The  use  of  a  vector  means  that 
a  user  can  carry  out  multiple  tasks  simultaneously  and,  by 
assigning  weights  to  those  tasks,  different  priorities  can  be 
assigned. For example, at a certain time a user might be
given  a  task  to  follow  a  route  between  two  points.  However, 
the  user  is  also  concerned  that  (s)he  does  not  enter  an  unsafe 
environment.  Therefore,  two  tasks  —  route  following  and 
avoiding  unsafe  areas  —  run  concurrently.  The  task  vec¬ 
tor  is  supplemented  by  additional  ancillary  information.  In 
the  route  following  task,  the  system  needs  to  store  the  way 
points  and  the  final  destination  of  the  route. 

We  now  formalize  the  framework. 

4.2  The  Filtering  Framework 

The state of the jth user is $\mathbf{u}_j$, where

$$\mathbf{u}_j = (\mathbf{p}_j, \mathbf{t}_j). \qquad (1)$$

The user’s objective state is $\mathbf{p}_j$ and the user’s task vector is
$\mathbf{t}_j$. The task vector can be linked to ancillary task-specific
information.

The user’s focus, $f_j^m$, is a function of the user’s state and
medium $m$ and is given by the equation

$$f_j^m = f(\mathbf{u}_j, m). \qquad (2)$$

The state of an object is only fully defined with respect
to a particular user. Specifically, the state of the ith object
with respect to the jth user is $\mathbf{x}_{ij}$, where

$$\mathbf{x}_{ij} = (\mathbf{o}_i, \mathbf{s}_{ij}). \qquad (3)$$

The vector of objective properties is $\mathbf{o}_i$, and the vector of
subjective properties is $\mathbf{s}_{ij}$. The subjective properties are
derived from the user’s state and the object’s objective state
by a domain expert. This relationship is captured in the
equation

$$\mathbf{s}_{ij} = s(\mathbf{o}_i, \mathbf{u}_j) \qquad (4)$$

where $s(\cdot, \cdot)$ is a function which represents the domain
expert’s analysis of the objective and subjective properties.

The nimbus of the ith object for the jth user in medium
$m$ is $n_{ij}^m$,

$$n_{ij}^m = n(\mathbf{x}_{ij}, \mathbf{u}_j, m). \qquad (5)$$

Once the focus and the nimbus regions have been
calculated, the level of interaction which occurs between a given
focus and nimbus, $g_{ij}^m$, must be calculated. This is given
by the equation

$$g_{ij}^m = l(f_j^m, n_{ij}^m) \qquad (6)$$

where $l(\cdot, \cdot)$ is the level of interaction function.
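
As a rough illustration of how the pieces of the framework compose, the following Python sketch wires the focus, subjective-property, nimbus and level-of-interaction functions (Equations 2, 4, 5 and 6) together. The class names, field names and the `interaction_level` helper are hypothetical and are not part of the paper; the four functions are passed in as parameters because the framework leaves them application-specific.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class UserState:
    objective: Sequence[float]   # objective state, e.g. position and orientation
    tasks: Sequence[float]       # task vector, one weight per task


@dataclass(frozen=True)
class ObjectState:
    objective: Sequence[float]   # objective properties of the object
    subjective: Sequence[float]  # subjective properties (importance vector)


def interaction_level(user, obj_objective, medium,
                      focus_fn, subjective_fn, nimbus_fn, level_fn):
    """Chain the framework's functions for one user, one object, one medium."""
    focus = focus_fn(user, medium)                   # Equation 2
    subjective = subjective_fn(obj_objective, user)  # Equation 4
    state = ObjectState(obj_objective, subjective)   # object state for this user
    nimbus = nimbus_fn(state, user, medium)          # Equation 5
    return level_fn(focus, nimbus)                   # Equation 6
```

Keeping the four functions as parameters mirrors the way the framework separates the general structure from the domain expert's choices.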

5  A  Sniper  Avoidance  System 

In  this  section,  we  describe  an  implementation  of  the  fil¬ 
tering  algorithm  for  a  sniper  avoidance  application.  In  many 
law  enforcement,  hostage  rescue  and  peace  keeping  mis¬ 
sions  the  threat  of  snipers  cannot  be  ignored.  Snipers,  armed 
with  powerful  and  accurate  weapons,  exploit  the  3D  nature 
of  the  urban  environment.  The  system  described  here  is  in¬ 
tended  to  help  counter  sniper  threats  in  two  ways.  First,  the 
system provides safe routing through an urban environment
avoiding  sniper  threats.  Second,  it  provides  information  that 
is  relevant  for  planning  an  operation  to  disarm  a  sniper. 

Task  analysis  from  the  US  Army  Training  Manual  on 
urban  operations  [2]  suggests  that  six  tasks  are  relevant: 

1.  Route:  The  user  needs  to  travel  from  point  A  to  point 
B.  The  system  should  show  the  positions  of  the  snipers 
and the user’s destination point at all times. The system
should also show fire hazards (areas that are
well suited for a sniper to set up an ambush) at a
medium  range  from  the  user’s  position.  The  system 
should  also  provide  general  orientation  information 
such  as  the  names  of  streets  and  nearby  buildings. 

2.  Building  Entry:  In  this  task,  a  user  enters  a  building 
where  a  sniper  is  believed  to  be  located.  The  system 
should  show  the  suspected  sniper  positions,  possible 
ambush sites, and the windows and doors of
the building where the sniper is located.


3. Strategic Planning: For this task the user must
develop a broad plan of action which might cover many
city  blocks.  The  system  should  provide  the  user  with  a 
broad  view  of  the  environment  which  does  not  neces¬ 
sarily  convey  much  depth.  For  example,  the  user  will 
want  to  see  buildings  but  does  not  need  fine-grained 
data  such  as  the  location  of  individual  windows  or 
doors. 

4.  Tactical  Planning:  The  user  must  develop  a  detailed 
plan  of  action  involving  a  small  area  (typically  only 
a  few  buildings).  This  task  is  invoked  when  a  sniper 
position  has  been  reported  and  the  user  is  developing 
plans to negotiate around the target area. Fine-grained
data, such as the positions of windows that might be used
by the sniper, is highly relevant.

5.  Offensive:  The  user  is  going  to  perform  offensive  ma¬ 
neuvers  to  disarm  the  sniper.  The  system  should  show 
information  about  the  known  snipers  only. 

6.  Defensive:  The  user  is  going  to  perform  defensive  ma¬ 
neuvers  as  the  result  of  a  sniper  attack.  The  position 
of  the  sniper,  buildings  that  offer  good  shelter  and  fire 
hazards  should  all  be  shown. 

Two  types  of  media  can  be  defined.  They  are: 

1. Visibility: Are two objects visible to each other?

2.  Offensive  Capabilities:  Are  the  users  within  range  of 
a  sniper’s  weapon? 

An  initial  implementation  of  the  filtering  algorithm  has 
been  completed  for  a  single  user  in  a  mobile  environment. 
This implementation defines the user state $\mathbf{u}_j$ (objective
plus task vector), the object state $\mathbf{x}_{ij}$ (objective state and
importance  vector)  and  provides  sample  implementations 
of  Equations  2  to  6.  Because  this  is  a  prototype,  many  of 
the  parameters  have  not  been  fully  defined  from  empirical 
studies  and  so  the  burden  of  setting  the  parameters  has  been 
placed  directly  on  the  user.  To  minimize  this  burden  we  fol¬ 
lowed  the  approach  described  in  [1]  and  provided  the  user 
with  a  set  of  interactive  controls.  These  interactive  controls 
consist  of  a  set  of  sliders  which  are  visible  in  the  user’s  head 
mounted  display  and  can  be  controlled  by  a  trackpad.  Fur¬ 
thermore,  we  only  consider  the  problem  of  working  in  the 
medium  of  offensive  capabilities. 

5.1  User  State 

Objective  properties: 

•  Location:  The  position  and  orientation  of  the  user. 
This  is  measured  directly  by  the  user’s  tracking  sys¬ 
tem. 

Component   Meaning
1           Route
2           Building entry
3           Strategic planning
4           Tactical planning
5           Offensive
6           Defensive

Table 1. The Structure of the Task Vector


Subjective  properties: 

•  Task  vector:  This  consists  of  the  current  user’s  goal. 
The components are defined in Table 1.

•  Focus:  This  is  represented  as  a  square  centered  on  the 
user’s  current  location. 

5.2  Object  State 

Objective  properties: 

•  Type:  An  enumerated  quantity  that  specifies  the  object 
type.  This  can  be  qualified  by  subtype  information  that 
provides  more  data.  For  example,  a  DOOR  object  can 
be  designated  as  a  MAIN_ENTRANCE. 

•  Location:  The  position  and  orientation  of  the  entity. 
Because  the  objects  are  stored  in  a  hierarchy,  an  en¬ 
tity’s  location  is  specified  with  respect  to  the  location 
of  its  parent. 

•  Size:  A  bounding  box.  For  the  i  object  the  lengths  of 

the  sides  are  given  by  the  triple  of  numbers  [vq  ,  . 

• Impact zone: This is a range over which the object
has direct impact on its environment. This is currently
modeled as an axis-aligned box whose sides are of
length $[z_i^1, z_i^2, z_i^3]$.

Subjective  properties: 

• Importance vector: The importance vector is
seven-dimensional. The criteria are listed in Table 2.

•  Nimbus:  The  calculated  nimbus  of  the  object.  The  al¬ 
gorithm  for  calculating  the  nimbus  is  described  below. 

5.3  Importance  Vector  Calculation 

The  importance  vector  is  calculated  for  each  object  by 
domain experts according to Equation 4. With this
prototype system, only the following two factors are utilized:


• Taller buildings tend to confer a greater tactical
advantage because a user at the top of the building has a
larger field of view. This tends to increase the weight
on criterion 2 (where a sniper could prepare an ambush).

• The type of object is used in the strategic and tactical
planning tasks. Specifically, if the user is in strategic
planning mode and the object is a fine-grained building
feature (window, door, etc.), its weight is reduced.

5.4  Focus  Calculation 


The  focus  is  calculated  from  Equation  2.  However,  as 
explained,  the  focus  is  currently  determined  manually.  It  is 
a square bounding box whose dimensions can be
continuously varied between 0 m and 500 m.

5.5  Nimbus  Determination 


The nimbus is calculated using the following
implementation of Equation 5. The nimbus $n_{ij}^m$ is a bounding box.
The  length  of  the  dth  side  of  this  box  is  and  its  value 
is  given  by 

=  (7) 

The  terms  and  z^^  are  the  length  of  the  size  of  the  object 
and  its  impact  zone  in  the  d  dimension.  The  final  term 
is  a  user-task-object  inflation  term.  To  calculate  its  value, 
we  score  the  importance  vector  against  the  user’s  task  vec¬ 
tor.  To  achieve  this  scoring,  the  task  vector  is  first  projected 
into  importance  vector  space.  We  model  this  projection  as 
the  linear  operation  given  by  the  matrix  M,  where 


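$$
M = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 1 \\
1 & 1 & 1 & 1 & 0 & 1 \\
0 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 \\
1 & 1 & 1 & 1 & 0 & 0
\end{bmatrix} \qquad (8)
$$
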


The  element  in  the  ith  row  of  the  jth  column  encodes 
whether  the  ith  element  of  the  object’s  importance  vector 
is important for the jth task. For example, the fourth row
encodes  the  fact  that  the  system  should  show  the  user  the 
possible  location  of  civilians  (the  fourth  element  in  the  im¬ 
portance  vector)  only  in  the  strategic  and  tactical  planning 
modes  (elements  3  and  4  in  the  task  vector). 

The  scoring  is  then  given  by 

=  0.2  min  ^1,  j  .  (9) 
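
The projection and scoring steps can be sketched in Python as follows. The matrix is the seven-row, six-column M of Equation 8 (rows are the importance criteria of Table 2, columns the tasks of Table 1). The factor 0.2 and the `min(1, ...)` clamp come from Equation 9, but the argument of the clamp is not legible in this copy, so the inner product used in `inflation_score` is an assumption, as are all function names.

```python
# Rows: importance criteria 1..7 (Table 2); columns: tasks 1..6 (Table 1).
M = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 1, 1, 1, 0, 1],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 1, 1, 0, 0],
]


def project_tasks(task_vector):
    """Project a 6-element task vector into 7-element importance space (M t)."""
    if len(task_vector) != 6:
        raise ValueError("task vector must have 6 components")
    return [sum(m * t for m, t in zip(row, task_vector)) for row in M]


def inflation_score(importance, task_vector, scale=0.2):
    """Score an importance vector against the projected task vector.

    ASSUMPTION: the clamped quantity is the inner product of the importance
    vector with M t; only the 0.2 factor and the min(1, .) are from the text.
    """
    if len(importance) != 7:
        raise ValueError("importance vector must have 7 components")
    weights = project_tasks(task_vector)
    match = sum(i * w for i, w in zip(importance, weights))
    return scale * min(1.0, match)
```

For instance, with only the offensive task active, the projection selects the "known sniper position" criterion alone, which agrees with the description of that task.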

Component   Meaning
1           Known sniper position
2           Where sniper could prepare an ambush
3           Shelter from sniper attacks
4           Where civilians may be gathering
5           Where friendly forces are located
6           Way point on a route which a user is following
7           Manually highlighted by user

Table 2. The Structure of the Importance Vector


Calculate nimbuses for all objects with initial object
state and user goals
loop forever {
    if (state of an object has changed) {
        Update nimbus of that object
        Filter that object
    }
    if (user’s goal has changed) {
        Update nimbuses of all objects
        Filter all objects
    }
    if (user position has changed more than
        a threshold distance) {
        Filter all objects
    }
}

Figure 2. The pseudo-code of the main filtering loop.
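
The loop of Figure 2 might be realized along the following lines in Python. This is a sketch under stated assumptions rather than the prototype's code: the class `FilterLoop`, its callback signatures, the 2 m default threshold, and the choice to measure movement from the position of the last full filtering pass are all hypothetical.

```python
import math


class FilterLoop:
    """One-iteration-at-a-time version of the main filtering loop."""

    def __init__(self, objects, goal, position,
                 compute_nimbus, filter_object, threshold=2.0):
        self.objects = dict(objects)          # object id -> object state
        self.goal = goal                      # current user goal
        self.compute_nimbus = compute_nimbus  # (state, goal) -> nimbus
        self.filter_object = filter_object    # (obj_id, nimbus, position) -> None
        self.threshold = threshold            # hypothetical value, in metres
        self.last_filter_pos = position
        # Initial pass: nimbuses for all objects with initial state and goal.
        self.nimbus = {k: compute_nimbus(s, goal) for k, s in self.objects.items()}

    def _filter_all(self, position):
        for obj_id, nimbus in self.nimbus.items():
            self.filter_object(obj_id, nimbus, position)
        self.last_filter_pos = position

    def step(self, position, changed_objects=None, new_goal=None):
        """Process one round of updates; call once per frame or tracker update."""
        for obj_id, state in (changed_objects or {}).items():
            self.objects[obj_id] = state
            self.nimbus[obj_id] = self.compute_nimbus(state, self.goal)
            self.filter_object(obj_id, self.nimbus[obj_id], position)
        if new_goal is not None and new_goal != self.goal:
            self.goal = new_goal
            self.nimbus = {k: self.compute_nimbus(s, new_goal)
                           for k, s in self.objects.items()}
            self._filter_all(position)
        elif math.dist(position, self.last_filter_pos) > self.threshold:
            self._filter_all(position)
```

Unlike the figure, the sketch uses `elif` for the position test so that a goal change and a large movement in the same step trigger only one full filtering pass.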


5.6  Focus-Nimbus  Interaction  Calculation 

The focus-nimbus interaction level, $g_{ij}^m$, is calculated
from the function $l(f_j^m, n_{ij}^m)$ in Equation 6. We use the
following approach. If the focus and nimbus boxes do not
overlap then $g_{ij}^m = 0$. If the user’s position lies inside the
nimbus then $g_{ij}^m = 1$. However, if the focus and nimbus
overlap but the user’s position does not lie inside the
nimbus, $g_{ij}^m = 1 - d/L$, where $d$ is the minimum distance
between the perimeter of the nimbus and the user’s current
location and $L$ is the length of the user’s focus.
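
A minimal two-dimensional Python sketch of this three-case rule is given below. It assumes axis-aligned boxes, a square focus centered on the user, and reads the third case as one minus the ratio of distance to focus length; the `Box` type, the helper names and the final clamp to zero are hypothetical additions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def overlaps(a, b):
    """True if two axis-aligned boxes intersect (touching counts)."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax and
            a.ymin <= b.ymax and b.ymin <= a.ymax)


def contains(box, x, y):
    return box.xmin <= x <= box.xmax and box.ymin <= y <= box.ymax


def distance_to_box(box, x, y):
    """Minimum distance from an outside point to the box perimeter."""
    dx = max(box.xmin - x, 0.0, x - box.xmax)
    dy = max(box.ymin - y, 0.0, y - box.ymax)
    return math.hypot(dx, dy)


def interaction_level(user_xy, focus_side, nimbus):
    """Level of interaction between a square focus and a nimbus box."""
    x, y = user_xy
    half = focus_side / 2.0
    focus = Box(x - half, y - half, x + half, y + half)
    if not overlaps(focus, nimbus):
        return 0.0                        # no overlap
    if contains(nimbus, x, y):
        return 1.0                        # user is inside the nimbus
    d = distance_to_box(nimbus, x, y)
    return max(0.0, 1.0 - d / focus_side)  # clamp is an added safeguard
```

For example, with a nimbus spanning x from 10 to 20 and a 10 m focus, a user standing 4 m outside the nimbus receives a level of 0.6.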

5.7  Filtering  Architecture 

In  Figure  2  we  provide  the  pseudo-code  for  the  main  fil¬ 
tering  loop.  This  algorithm  is  completely  dynamic — it  can 
respond  to  any  changes  to  the  user  or  to  the  entities  in  the 
environment. 


5.8  User  Interaction  and  Display  Cues 

The filtering mechanisms described in the previous subsections
provide a means for automatically calculating the importance of
buildings. However, the question remains as to how this information
should be displayed to the user. A simple binary show/no show scheme
has the problem that configurations can arise in which a small change
in the user’s position leads to an extremely large and disorientating
change in the state of the graphical display. Therefore, based on our
previous experience building situated AR displays [10], we use the
following simple display cues to smooth the transitions between
different information display types:

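1. Objects are faded in and out rather than appearing and
disappearing. The alpha value for the ith object, $\alpha_i$, is
calculated by the ramping function

$$\alpha_i = \begin{cases} g_i / \beta & \text{for } g_i < \beta \\ 1 & \text{elsewhere} \end{cases}$$

where $g_i$ is the focus-nimbus interaction level of the object and
$\beta = 0.3$.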

2. The user has the capability to select an object and define its
importance. One element of the importance vector records whether the
user has selected the object (component 7 in Table 2), so that a
manual selection feeds directly into the importance calculation.
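
The fading cue in item 1 can be sketched as follows. The sketch assumes that the ramp is linear, rising from fully transparent at an interaction level of 0 to fully opaque at the threshold of 0.3 given in the text. The function name `fade_alpha` and its parameter names are hypothetical.

```python
def fade_alpha(interaction_level, beta=0.3):
    """Alpha value for an object, given its focus-nimbus interaction level.

    Assumes a linear ramp below the threshold beta and full opacity
    at or above it.
    """
    if interaction_level < beta:
        return interaction_level / beta
    return 1.0


# An object halfway up the ramp is drawn at roughly half opacity,
# while an object well inside the focus is drawn fully opaque.
for level in (0.0, 0.15, 0.3, 0.9):
    print(level, fade_alpha(level))
```

Because the ramp is continuous at the threshold, an object whose interaction level drifts across 0.3 does not pop in or out of view.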

5.9  Results 

The effect of the filtering algorithm is demonstrated in Figure 3,
which shows a pair of images captured from the actual output of the
AR display, which was mounted on a mannequin head. The results show
the system running in the Tactical Planning mode. In this mode, a
user sees detailed environmental information. The image on the left
shows the output from the system when filtering is disabled. As can
be seen, the image is highly cluttered: the system is showing the
user data about the infrastructure of buildings behind the currently
visible building. The effects of filtering are shown in the
right-hand picture. As can be seen, the clutter has been eliminated.
However, the system has not used a simple fixed-distance clipping
strategy: a reported sniper location on a building behind the visible
building is still displayed.

A better example of the impact of the filter can be seen in Figure 4,
which shows three frames grabbed from the frame buffer of the mobile
graphics system. These three frames illustrate the effect of
different task vectors on the user’s output. Figure 4(a) shows the
effect when filtering is disabled. The user is looking towards a
building in the foreground. A sniper is visible in the background,
but is hard to see due to the large amount of information being
displayed. Figure 4(b) shows the view when the user switches to
Tactical Planning mode. The system shows only the location of the
sniper and the building in which the sniper is believed to be
located. Because the building is at some distance from the user, it
is drawn more faintly to denote its lower importance. However, the
persistent importance of the sniper means that it is drawn brightly.
Figure 4(c) shows the same view when the Route mode is enabled. In
this view, the system shows the locations of threats (the sniper) and
local landmarks (the foreground building).

Although we have yet to perform formal user evaluation studies, user
response to the filtering algorithm has been extremely positive.
Users have commented that the algorithm eliminates superfluous
information while retaining the data that is critical to a sniper
avoidance system.

In the example shown, the system sustains 20 frames per second
(stereo). Profiling reveals that the filtering algorithm, implemented
in Java on the mobile computer (a 266 MHz Pentium Pro), completely
filters an environment of 150 objects in less than one millisecond.
This performance is sufficient for our current development system.

6  Conclusions  and  Future  Work 

This paper has described an automated information filtering algorithm
that we use to declutter the display of an experimental mobile AR
system. The algorithm is based on the spatial model of interaction
and utilizes a focus and a nimbus. We described a method for
calculating the focus and nimbus which decomposes objects into
objective and subjective properties. We demonstrated the use of this
approach in a sniper avoidance system.

There  are  several  areas  of  further  work  to  be  carried  out: 

•  User studies and detailed domain analysis need to be carried out
to refine domain expertise. This will be used to enhance the
structure of the information vector and the evaluation of objects
with respect to the appropriate criteria.


•  The complexity of the environment model and of the criteria used
to develop the focus and nimbus regions will be greatly enhanced. The
current implementation uses only simple geometric descriptions
(axis-aligned bounding boxes) to model the environment and simple
queries (box intersections). We propose to extend our algorithm to
incorporate line-of-sight and visibility constraints [9] and to use
more sophisticated intersection algorithms [16].

•  In future research, the algorithm can be combined with dynamic
and flexible view management capabilities. Through the use of
mechanisms such as constraint-based layout control, new annotations
can be combined. The filter could be extended to provide priorities
for the types of information which are being displayed.

(a) Original unfiltered view.

(b) Filtering with Tactical Planning mode enabled.

(c) Filtering with Route mode enabled.

Figure 4. Several images from the backpack computer captured directly
from the system’s frame buffer, showing the effect of several
different filter modes. It should be noted that the user’s position
is different from that in Figure 3.